
<!DOCTYPE html
  PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   
      <title>14.5.&nbsp;roman.py, 第 5 阶段</title>
      <link rel="stylesheet" href="../diveintopython.css" type="text/css">
      <link rev="made" href="mailto:f8dy@diveintopython.org">
      <meta name="generator" content="DocBook XSL Stylesheets V1.52.2">
      <meta name="keywords" content="Python, Dive Into Python, tutorial, object-oriented, programming, documentation, book, free">
      <meta name="description" content="Python from novice to pro">
      <link rel="home" href="../toc/index.html" title="Dive Into Python">
      <link rel="up" href="stage_1.html" title="第&nbsp;14&nbsp;章&nbsp;测试优先编程">
      <link rel="previous" href="stage_4.html" title="14.4.&nbsp;roman.py, 第 4 阶段">
      <link rel="next" href="../refactoring/index.html" title="第&nbsp;15&nbsp;章&nbsp;重构">
   </head>
   <body>
      <table id="Header" width="100%" border="0" cellpadding="0" cellspacing="0" summary="">
         <tr>
            <td id="breadcrumb" colspan="5" align="left" valign="top">导航：<a href="../index.html">起始页</a>&nbsp;&gt;&nbsp;<a href="../toc/index.html">Dive Into Python</a>&nbsp;&gt;&nbsp;<a href="stage_1.html">测试优先编程</a>&nbsp;&gt;&nbsp;<span class="thispage">roman.py, 第 5 阶段</span></td>
            <td id="navigation" align="right" valign="top">&nbsp;&nbsp;&nbsp;<a href="stage_4.html" title="上一页: “roman.py, 第 4 阶段”">&lt;&lt;</a>&nbsp;&nbsp;&nbsp;<a href="../refactoring/index.html" title="下一页: “重构”">&gt;&gt;</a></td>
         </tr>
         <tr>
            <td colspan="3" id="logocontainer">
               <h1 id="logo"><a href="../index.html" accesskey="1">深入 Python :Dive Into Python 中文版</a></h1>
               <p id="tagline">Python 从新手到专家 [Dip_5.4b_CPyUG_Release]</p>
            </td>
            <td colspan="3" align="right">
               <form id="search" method="GET" action="http://www.google.com/custom">
                  <p><label for="q" accesskey="4">Find:&nbsp;</label><input type="text" id="q" name="q" size="20" maxlength="255" value=""> <input type="submit" value="搜索"><input type="hidden" name="domains" value="woodpecker.org.cn/diveintopython"><input type="hidden" name="sitesearch" value="www.woodpecker.org.cn/diveintopython"></p>
               </form>
            </td>
         </tr>
      </table>
      <!--#include virtual="/inc/ads" -->
      <div class="section" lang="zh_cn">
         <div class="titlepage">
            <div>
               <div>
                  <h2 class="title"><a name="roman.stage5"></a>14.5.&nbsp;<tt class="filename">roman.py</tt>, 第 5 阶段
                  </h2>
               </div>
            </div>
            <div></div>
         </div>
         <div class="abstract">
            <p>现在 <tt class="function">fromRoman</tt> 对于有效输入能够正常工作了，是揭开最后一个谜底的时候了：使它正常工作于无效输入的情况下。这意味着要找出一个方法检查一个字符串是不是有效的罗马数字。这比 <tt class="function">toRoman</tt> 中<a href="stage_3.html" title="14.3.&nbsp;roman.py, 第 3 阶段">验证有效的数字输入</a>困难，但是你可以使用一个强大的工具：正则表达式。
            </p>
         </div>
         <p>如果你不熟悉正则表达式，并且没有读过 <a href="../regular_expressions/index.html" title="第&nbsp;7&nbsp;章&nbsp;正则表达式">第&nbsp;7&nbsp;章 <i>正则表达式</i></a>，现在是该好好读读的时候了。
         </p>
         <p>如你在 <a href="../regular_expressions/roman_numerals.html" title="7.3.&nbsp;个案研究：罗马字母">第&nbsp;7.3&nbsp;节 “个案研究：罗马字母”</a>中所见到的，构建罗马数字有几个简单的规则：使用字母 <tt class="literal">M</tt>, <tt class="literal">D</tt>, <tt class="literal">C</tt>, <tt class="literal">L</tt>, <tt class="literal">X</tt>, <tt class="literal">V</tt> 和 <tt class="literal">I</tt>。让我们回顾一下：
         </p>
         <div class="orderedlist">
            <ol type="1">
               <li>字符是被“加”在一起的：<tt class="literal">I</tt> 是 <tt class="constant">1</tt>，<tt class="literal">II</tt> 是 <tt class="literal">2</tt>，<tt class="literal">III</tt> 是 <tt class="literal">3</tt>。<tt class="literal">VI</tt> 是 <tt class="literal">6</tt> (看上去就是 “<span class="quote"><tt class="literal">5</tt> 加 <tt class="literal">1</tt></span>”)，<tt class="literal">VII</tt> 是 <tt class="literal">7</tt>，<tt class="literal">VIII</tt> 是 <tt class="literal">8</tt>。
               </li>
               <li>这些字符 (<tt class="literal">I</tt>, <tt class="literal">X</tt>, <tt class="literal">C</tt> 和 <tt class="literal">M</tt>) 最多可以重复三次。对于 <tt class="literal">4</tt>，你则需要利用下一个能够被5整除的字符进行减操作得到。你不能把 <tt class="literal">4</tt> 表示为 <tt class="literal">IIII</tt> 而应该表示为 <tt class="literal">IV</tt> (“<span class="quote">比 <tt class="literal">5</tt> 小 <tt class="literal">1</tt> </span>”)。<tt class="literal">40</tt> 则被写作 <tt class="literal">XL</tt> (“<span class="quote">比 <tt class="literal">50</tt> 小 <tt class="literal">10</tt></span>”)，<tt class="literal">41</tt> 表示为 <tt class="literal">XLI</tt>，<tt class="literal">42</tt> 表示为 <tt class="literal">XLII</tt>，<tt class="literal">43</tt> 表示为 <tt class="literal">XLIII</tt>，<tt class="literal">44</tt> 表示为 <tt class="literal">XLIV</tt> (“<span class="quote">比<tt class="literal">50</tt>小<tt class="literal">10</tt>，加上 <tt class="literal">5</tt> 小 <tt class="literal">1</tt></span>”)。
               </li>
               <li>类似地，对于数字 <tt class="literal">9</tt>，你必须利用下一个能够被10整除的字符进行减操作得到：<tt class="literal">8</tt> 是 <tt class="literal">VIII</tt>，而 <tt class="literal">9</tt> 是 <tt class="literal">IX</tt> (“<span class="quote">比 <tt class="literal">10</tt> 小 <tt class="literal">1</tt></span>”)，而不是 <tt class="literal">VIIII</tt> (由于 <tt class="literal">I</tt> 不能重复四次)。<tt class="literal">90</tt> 表示为 <tt class="literal">XC</tt>，<tt class="literal">900</tt> 表示为 <tt class="literal">CM</tt>。
               </li>
               <li>含五的字符不能被重复：<tt class="literal">10</tt> 应该表示为 <tt class="literal">X</tt>，而不会是 <tt class="literal">VV</tt>。<tt class="literal">100</tt> 应该表示为 <tt class="literal">C</tt>，而不是 <tt class="literal">LL</tt>。
               </li>
               <li>罗马数字一般从高位到低位书写，从左到右阅读，因此不同顺序的字符意义大不相同。<tt class="literal">DC</tt> 是 <tt class="literal">600</tt>，<tt class="literal">CD</tt> 是完全另外一个数 (<tt class="literal">400</tt>，“<span class="quote">比 <tt class="literal">500</tt> 少 <tt class="literal">100</tt></span>”)。<tt class="literal">CI</tt> 是 <tt class="literal">101</tt>，而 <tt class="literal">IC</tt> 根本就不是一个有效的罗马数字 (因为你无法从<tt class="literal">100</tt>直接减<tt class="literal">1</tt>，应该写成 <tt class="literal">XCIX</tt>，意思是 “<span class="quote">比 <tt class="literal">100</tt> 少 <tt class="literal">10</tt>，然后加上数字 <tt class="literal">9</tt>，也就是比 <tt class="literal">10</tt> 少 <tt class="literal">1</tt></span>”)。
               </li>
            </ol>
         </div>
         <div class="example"><a name="d0e34178"></a><h3 class="title">例&nbsp;14.12.&nbsp;<tt class="filename">roman5.py</tt></h3>
            <p>这个程序可以在例子目录下的<tt class="filename">py/roman/stage5/</tt> 目录中找到。
            </p>
            <p>如果您还没有下载本书附带的样例程序, 可以 <a href="http://www.woodpecker.org.cn/diveintopython/download/diveintopython-exampleszh-cn-5.4b.zip" title="Download example scripts">下载本程序和其他样例程序</a>。
            </p><pre class="programlisting">
<span class='pystring'>"""Convert to and from Roman numerals"""</span>
<span class='pykeyword'>import</span> re

<span class='pycomment'>#Define exceptions</span>
<span class='pykeyword'>class</span><span class='pyclass'> RomanError</span>(Exception): <span class='pykeyword'>pass</span>
<span class='pykeyword'>class</span><span class='pyclass'> OutOfRangeError</span>(RomanError): <span class='pykeyword'>pass</span>
<span class='pykeyword'>class</span><span class='pyclass'> NotIntegerError</span>(RomanError): <span class='pykeyword'>pass</span>
<span class='pykeyword'>class</span><span class='pyclass'> InvalidRomanNumeralError</span>(RomanError): <span class='pykeyword'>pass</span>

<span class='pycomment'>#Define digit mapping</span>
romanNumeralMap = ((<span class='pystring'>'M'</span>,  1000),
                   (<span class='pystring'>'CM'</span>, 900),
                   (<span class='pystring'>'D'</span>,  500),
                   (<span class='pystring'>'CD'</span>, 400),
                   (<span class='pystring'>'C'</span>,  100),
                   (<span class='pystring'>'XC'</span>, 90),
                   (<span class='pystring'>'L'</span>,  50),
                   (<span class='pystring'>'XL'</span>, 40),
                   (<span class='pystring'>'X'</span>,  10),
                   (<span class='pystring'>'IX'</span>, 9),
                   (<span class='pystring'>'V'</span>,  5),
                   (<span class='pystring'>'IV'</span>, 4),
                   (<span class='pystring'>'I'</span>,  1))

<span class='pykeyword'>def</span><span class='pyclass'> toRoman</span>(n):
    <span class='pystring'>"""convert integer to Roman numeral"""</span>
    <span class='pykeyword'>if</span> <span class='pykeyword'>not</span> (0 &lt; n &lt; 4000):
        <span class='pykeyword'>raise</span> OutOfRangeError, <span class='pystring'>"number out of range (must be 1..3999)"</span>
    <span class='pykeyword'>if</span> int(n) &lt;&gt; n:
        <span class='pykeyword'>raise</span> NotIntegerError, <span class='pystring'>"non-integers can not be converted"</span>

    result = <span class='pystring'>""</span>
    <span class='pykeyword'>for</span> numeral, integer <span class='pykeyword'>in</span> romanNumeralMap:
        <span class='pykeyword'>while</span> n &gt;= integer:
            result += numeral
            n -= integer
    <span class='pykeyword'>return</span> result

<span class='pycomment'>#Define pattern to detect valid Roman numerals</span>
romanNumeralPattern = <span class='pystring'>'^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$'</span> <a name="roman.stage5.3.1"></a><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12">

<span class='pykeyword'>def</span><span class='pyclass'> fromRoman</span>(s):
    <span class='pystring'>"""convert Roman numeral to integer"""</span>
    <span class='pykeyword'>if</span> <span class='pykeyword'>not</span> re.search(romanNumeralPattern, s):                                    <a name="roman.stage5.3.2"></a><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12">
        <span class='pykeyword'>raise</span> InvalidRomanNumeralError, <span class='pystring'>'Invalid Roman numeral: %s'</span> % s

    result = 0
    index = 0
    <span class='pykeyword'>for</span> numeral, integer <span class='pykeyword'>in</span> romanNumeralMap:
        <span class='pykeyword'>while</span> s[index:index+len(numeral)] == numeral:
            result += integer
            index += len(numeral)
    <span class='pykeyword'>return</span> result
</pre><div class="calloutlist">
               <table border="0" summary="Callout list">
                  <tr>
                     <td width="12" valign="top" align="left"><a href="#roman.stage5.3.1"><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12"></a> 
                     </td>
                     <td valign="top" align="left">这只是 <a href="../regular_expressions/roman_numerals.html" title="7.3.&nbsp;个案研究：罗马字母">第&nbsp;7.3&nbsp;节 “个案研究：罗马字母”</a> 中讨论的匹配模版的继续。十位上可能是<tt class="literal">XC</tt> (<tt class="literal">90</tt>)，<tt class="literal">XL</tt> (<tt class="literal">40</tt>)，或者可能是 <tt class="literal">L</tt> 后面跟着 0 到 3 个 <tt class="literal">X</tt> 字符。个位则可能是 <tt class="literal">IX</tt> (<tt class="literal">9</tt>)，<tt class="literal">IV</tt> (<tt class="literal">4</tt>)，或者是一个可能是 <tt class="literal">V</tt> 后面跟着 0 到 3 个 <tt class="literal">I</tt> 字符。
                     </td>
                  </tr>
                  <tr>
                     <td width="12" valign="top" align="left"><a href="#roman.stage5.3.2"><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12"></a> 
                     </td>
                     <td valign="top" align="left">把所有的逻辑编码成正则表达式，检查无效罗马字符的代码就很简单了。如果 <tt class="function">re.search</tt> 返回一个对象则表示匹配了正则表达式，输入是有效的，否则输入无效。
                     </td>
                  </tr>
               </table>
            </div>
         </div>
         <p>这里你可能会怀疑，这个面目可憎的正则表达式是否真能查出错误的罗马字符表示。没关系，不必完全听我的，不妨看看下面的结果：</p>
         <div class="example"><a name="d0e34248"></a><h3 class="title">例&nbsp;14.13.&nbsp;用 <tt class="filename">romantest5.py</tt> 测试 <tt class="filename">roman5.py</tt> 的结果
            </h3><pre class="screen"><span class="computeroutput">
fromRoman should only accept uppercase input ... ok          </span><a name="roman.stage5.4.1"></a><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12"><span class="computeroutput">
toRoman should always return uppercase ... ok
fromRoman should fail with malformed antecedents ... ok      </span><a name="roman.stage5.4.2"></a><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12"><span class="computeroutput">
fromRoman should fail with repeated pairs of numerals ... ok </span><a name="roman.stage5.4.3"></a><img src="../images/callouts/3.png" alt="3" border="0" width="12" height="12"><span class="computeroutput">
fromRoman should fail with too many repeated numerals ... ok
fromRoman should give known result with known input ... ok
toRoman should give known result with known input ... ok
fromRoman(toRoman(n))==n for all n ... ok
toRoman should fail with non-integer input ... ok
toRoman should fail with negative input ... ok
toRoman should fail with large input ... ok
toRoman should fail with 0 input ... ok

----------------------------------------------------------------------
Ran 12 tests in 2.864s

OK                                                           </span><a name="roman.stage5.4.4"></a><img src="../images/callouts/4.png" alt="4" border="0" width="12" height="12"></pre><div class="calloutlist">
               <table border="0" summary="Callout list">
                  <tr>
                     <td width="12" valign="top" align="left"><a href="#roman.stage5.4.1"><img src="../images/callouts/1.png" alt="1" border="0" width="12" height="12"></a> 
                     </td>
                     <td valign="top" align="left">有件事我未曾讲过，那就是默认情况下正则表达式大小写敏感。由于正则表达式 <tt class="varname">romanNumeralPattern</tt> 是以大写字母构造的，<tt class="function">re.search</tt> 将拒绝不全部是大写字母构成的输入。因此大写输入的检查就通过了。
                     </td>
                  </tr>
                  <tr>
                     <td width="12" valign="top" align="left"><a href="#roman.stage5.4.2"><img src="../images/callouts/2.png" alt="2" border="0" width="12" height="12"></a> 
                     </td>
                     <td valign="top" align="left">更重要的是，无效输入测试也通过了。例如，上面这个用例测试了 <tt class="literal">MCMC</tt> 之类的情形。正如你所见，这不匹配正则表达式，因此 <tt class="function">fromRoman</tt> 引发一个测试用例正在等待的 <tt class="errorcode">InvalidRomanNumeralError</tt> 异常，所以测试通过了。
                     </td>
                  </tr>
                  <tr>
                     <td width="12" valign="top" align="left"><a href="#roman.stage5.4.3"><img src="../images/callouts/3.png" alt="3" border="0" width="12" height="12"></a> 
                     </td>
                     <td valign="top" align="left">事实上，所有的无效输入测试都通过了。正则表达式捕捉了你在编写测试用例时所能预见的所有情况。</td>
                  </tr>
                  <tr>
                     <td width="12" valign="top" align="left"><a href="#roman.stage5.4.4"><img src="../images/callouts/4.png" alt="4" border="0" width="12" height="12"></a> 
                     </td>
                     <td valign="top" align="left">最终迎来了 “<span class="quote"><tt class="literal">OK</tt></span>”这个平淡的“年度大奖”，所有测试都通过后 <tt class="filename">unittest</tt> 模块就会输出它。
                     </td>
                  </tr>
               </table>
            </div>
         </div><a name="d0e34305"></a><table class="note" border="0" summary="">
            <tr>
               <td rowspan="2" align="center" valign="top" width="1%"><img src="../images/note.png" alt="注意" title="" width="24" height="24"></td>
            </tr>
            <tr>
               <td colspan="2" align="left" valign="top" width="99%">当所有测试都通过了，停止编程。</td>
            </tr>
         </table>
      </div>
      <table class="Footer" width="100%" border="0" cellpadding="0" cellspacing="0" summary="">
         <tr>
            <td width="35%" align="left"><br><a class="NavigationArrow" href="stage_4.html">&lt;&lt;&nbsp;roman.py, 第 4 阶段</a></td>
            <td width="30%" align="center"><br>&nbsp;<span class="divider">|</span>&nbsp;<a href="stage_1.html#roman.stage1" title="14.1.&nbsp;roman.py, 第 1 阶段">1</a> <span class="divider">|</span> <a href="stage_2.html" title="14.2.&nbsp;roman.py, 第 2 阶段">2</a> <span class="divider">|</span> <a href="stage_3.html" title="14.3.&nbsp;roman.py, 第 3 阶段">3</a> <span class="divider">|</span> <a href="stage_4.html" title="14.4.&nbsp;roman.py, 第 4 阶段">4</a> <span class="divider">|</span> <span class="thispage">5</span>&nbsp;<span class="divider">|</span>&nbsp;
            </td>
            <td width="35%" align="right"><br><a class="NavigationArrow" href="../refactoring/index.html">重构&nbsp;&gt;&gt;</a></td>
         </tr>
         <tr>
            <td colspan="3"><br></td>
         </tr>
      </table>
      <div class="Footer">
         <p class="copyright">Copyright © 2000, 2001, 2002, 2003, 2004 <a href="mailto:mark@diveintopython.org">Mark Pilgrim</a></p>
         <p class="copyright">Copyright © 2001, 2002, 2003, 2004, 2005, 2006, 2007 <a href="mailto:python-cn@googlegroups.com">CPyUG (邮件列表)</a></p>
      </div>
   </body>
</html>